Automatic Feature Selection for Sleep/Wake Classification with Small Data Sets
Authors
Abstract
This paper describes an automatic feature selection algorithm integrated into a classification framework that discriminates between sleep and wake states during the night. The algorithm uses the Mahalanobis distance and Spearman's rank-order correlation as selection criteria to restrict the search in a large feature space. It was tested with a leave-one-subject-out cross-validation procedure on 15 single-night PSG recordings of healthy sleepers and compared with a standard Sequential Forward Search (SFS) algorithm. It achieved comparable performance in terms of Cohen's kappa (κ = 0.62) and the area under the precision-recall curve (AUCPR = 0.59), while cutting computation time by a factor of nearly 10. The feature selection procedure, applied on each iteration of the cross-validation, was found to be stable, consistently selecting a similar list of features. It selected an average of 10.33 features per iteration, nearly half of the 21 features selected by SFS. In addition, learning curves show that the training and testing performances converge faster than for SFS and that the final gap between training and testing performance is smaller, suggesting that the new algorithm is better suited to data sets with a small number of subjects.
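The abstract names the two selection criteria but does not say how they are combined. The sketch below is one plausible reading, not the authors' algorithm: each feature is ranked by a univariate Mahalanobis distance between the sleep and wake class means (relevance), and a feature is kept only if its absolute Spearman correlation with every feature already kept stays below a threshold (redundancy). The function names, the `max_corr=0.8` threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def mahalanobis_1d(x, y):
    """Univariate Mahalanobis distance between the two class means of one feature."""
    a, b = x[y == 0], x[y == 1]
    # Pooled within-class variance (assumes both classes share one variance).
    pooled = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / (len(a) + len(b) - 2)
    return abs(a.mean() - b.mean()) / np.sqrt(pooled)

def spearman(u, v):
    """Spearman correlation as Pearson correlation of ranks (ties are not averaged)."""
    ru = np.argsort(np.argsort(u))
    rv = np.argsort(np.argsort(v))
    return np.corrcoef(ru, rv)[0, 1]

def select_features(X, y, max_corr=0.8):
    """Greedy filter: visit features by decreasing class separation, skip redundant ones."""
    scores = np.array([mahalanobis_1d(X[:, j], y) for j in range(X.shape[1])])
    selected = []
    for j in np.argsort(-scores):
        # Keep feature j only if it is not strongly rank-correlated with a kept feature.
        if all(abs(spearman(X[:, j], X[:, k])) < max_corr for k in selected):
            selected.append(int(j))
    return selected, scores

# Synthetic example (hypothetical data): feature 1 is a near-copy of feature 0.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)                   # 0 = wake, 1 = sleep (arbitrary labels)
f0 = rng.normal(size=200) + 2.0 * y          # strongly discriminative
f1 = f0 + 0.01 * rng.normal(size=200)        # redundant with f0
f2 = rng.normal(size=200) + 1.0 * y          # weaker but not redundant
X = np.column_stack([f0, f1, f2])

selected, scores = select_features(X, y)
print(selected, np.round(scores, 2))
```

Because this filter needs only one pass over the features and no classifier retraining, it is much cheaper than a wrapper such as SFS, which is consistent with the speed-up reported above. In a leave-one-subject-out setting it would be rerun on the training subjects of each fold.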
Similar resources
Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach
Feature selection can be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not perform well in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...
A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts
High dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...
Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection
Feature selection for various applications has been carried out for many years in many different research areas. However, there is a trade-off between finding feature subsets with minimum length and increasing the classification accuracy. In this paper, a filter-wrapper feature selection approach based on fuzzy-rough gain ratio is proposed to tackle this problem. As a search strategy, a modifie...
EEG gamma frequency and sleep-wake scoring in mice: comparing two types of supervised classifiers.
There is growing interest in sleep research and increasing demand for screening of circadian rhythms in genetically modified animals. This requires reliable sleep stage scoring programs. Present solutions suffer, however, from the lack of flexible adaptation to experimental conditions and unreliable selection of stage-discriminating variables. EEG was recorded in freely moving C57BL/6 mice and ...
Accurate Fault Classification of Transmission Line Using Wavelet Transform and Probabilistic Neural Network
Fault classification in distance protection of transmission lines, given the wide variation in fault operating conditions, has been a very challenging task. This paper presents a probabilistic neural network (PNN) and a new feature selection technique for fault classification in transmission lines. Initially, wavelet transform is used for feature extraction from half cycle of post-fa...
Publication date: 2013